Abstract—FIPS 140-3 is the main standard defining security 
requirements for cryptographic modules in the U.S. and Canada; 
commercially viable hardware modules generally need to be 
compliant with it. The scope of FIPS 140-3 will also expand 
to the new NIST Post-Quantum Cryptography (PQC) standards 
when migration from older RSA and Elliptic Curve cryptography 
begins. FIPS 140-3 mandates the testing of the effectiveness of 
“non-invasive attack mitigations”, or side-channel attack coun- 
termeasures. At higher security levels 3 and 4, the FIPS 140-3 
side-channel testing methods and metrics are expected to be those 
of ISO 17825, which is based on the older Test Vector Leakage 
Assessment (TVLA) methodology. We discuss how to apply ISO 
17825 to hardware modules that implement lattice-based PQC 
standards for public-key cryptography - Key Encapsulation 
Mechanisms (KEMs) and Digital Signatures. We find that simple 
“random key” vs. “fixed key” tests are unsatisfactory due to 
the close linkage between public and private components of 
PQC keypairs. While the general statistical testing approach 
and requirements can remain consistent with older public-key 
algorithms, a non-trivial challenge in creating ISO 17825 testing 
procedures for PQC is the careful design of test vector inputs so 
that only relevant Critical Security Parameter (CSP) leakage is 
captured in power, electromagnetic, and timing measurements. 

Index Terms—FIPS 140-3, Post-Quantum Cryptography, Side- 
Channel Attacks, DPA, DEMA, TVLA, ISO 17825 


I. TRANSITIONS: FIPS 140-3, SIDE-CHANNELS, AND PQC 


The U.S. & Canadian standard for the security of cryp- 
tographic modules is FIPS 140-3 [1]; after two decades, 
validation of implementations to the older FIPS 140-2 standard 
[2] ended in 2021. The new standard states that: 


Major changes in FIPS 140-3 are limited to the 
introduction of non-invasive physical requirements.¹ 


Non-invasive physical attacks use external physical measure- 
ments to derive secret information but do not modify the 
module’s state. They are commonly known as Side-Channel 
Attacks (SCA) and have been widely known since the 1990s; 
Timing Attacks (TA) [4] and Differential Power Attacks (DPA) 
[5] are the most common ones. 

Almost concurrently with the start of FIPS-mandated side- 
channel testing, NIST is transitioning to Post-Quantum Cryp- 
tography (PQC) standards for digital signatures and key 
establishment [6]. Hence designers of new high-assurance 
cryptographic modules for Government use are expected to 
meet both SCA and PQC requirements. 


¹However, there have been other substantial changes in FIPS 140 testing, 
such as the much more comprehensive entropy source requirements [3]. 
II. ISO 17825 SIDE-CHANNEL TESTING 
A. Limitations 


ISO 17825 [7] presents its tests as repeatable, “push-button” 
tests with moderate costs. The test metrics determine conformance 
to ISO 19790 [8] (and, by extension, future versions of 
FIPS 140-3 [1]) at Security Levels 3 and 4. The procedure 
primarily tests for statistical signs of first-order leakage of 
secret information (Critical Security Parameters, CSPs). Key 
recovery is not expected to be demonstrated. 

Due to its push-button leak-detection nature (without con- 
sideration for specific attacks), it is useful to think of ISO 
17825 as defining a “least common denominator” base level 
for side-channel security. It is not commensurable with an 
Evaluation Assurance Level (EAL) gained from Vulnerability 
Analysis (AVA_VAN) under the Common Criteria scheme [9]. 

The ISO 17825 [7] non-invasive test metrics exclude attacks that 
modify the module’s state, either destructively or through 
temporary faults. The test control interfaces are similar to the 
cryptographic module’s external API. Since fault attacks are 
excluded, the main requirement for laboratory equipment is 
to issue commands synchronously and gather high-precision 
measurements of power consumption (for DPA), electromagnetic 
emissions (for DEMA), or execution time (for Timing Attacks, TA). 


B. General Test Procedure 


Our working assumption is that a PQC Implementation 
Under Test (IUT) will be required to meet similar non-invasive 
security requirements as RSA and Elliptic Curve (ECDSA) implementations. 
The statistical methodology, calibration, and other 
technical requirements are largely the same as for older public- 
key cryptography in ISO 17825 [7] and ISO 20085 [10], [11]. 
Note that hybrid IUTs can implement both types of algorithms. 

The most readily applicable test is the “general statistical 
test procedure” [7, Figure 7]. The current version of these 
tests creates data subsets A and B of measurements (e.g., trace 
waveforms) with the IUT, where the input test vectors for A and 
B differ in the way input CSPs are selected. 

Example: Subset A may use a fixed bit value in the CSP, 
while measurements in set B use random CSP values. If the 
A/B measurement sets can be distinguished from each other 
with the Welch t-test with high enough statistical confidence, 
this is taken as evidence of the leakage of the CSP and will 
result in a FAIL. If the confidence level is not reached, the 
test result is a PASS for the test vectors. 
Outline of the General Statistical Test Procedure 


0. Determine the required sample size $N = N_A + N_B$ and 
t-test threshold $C$ from the experiment parameters. 

1. Collect Subsets A and B and compute their pointwise 
averages $(\mu_A, \mu_B)$ and standard deviations $(\sigma_A, \sigma_B)$. 

2. Compute the pointwise Welch t-test statistic vector 

$$T = \frac{\mu_A - \mu_B}{\sqrt{\dfrac{\sigma_A^2}{N_A} + \dfrac{\sigma_B^2}{N_B}}}$$


3. If at any point |T| > C, the test results in a FAIL. 
If the threshold is not crossed, the test is a PASS. 
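
The outline above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the procedure text of ISO 17825; the `simulate` helper is a hypothetical stand-in for trace acquisition, and the function names are our own.

```python
import random
from math import sqrt
from statistics import fmean, variance


def welch_t(traces_a, traces_b):
    """Pointwise Welch t-statistic for two sets of equal-length traces."""
    n_a, n_b = len(traces_a), len(traces_b)
    t_vector = []
    # zip(*traces) transposes the trace list into per-sample-point columns.
    for col_a, col_b in zip(zip(*traces_a), zip(*traces_b)):
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        var_a, var_b = variance(col_a), variance(col_b)  # sample variances
        t_vector.append((mu_a - mu_b) / sqrt(var_a / n_a + var_b / n_b))
    return t_vector


def verdict(traces_a, traces_b, threshold):
    """FAIL if |T| exceeds the threshold C at any sample point, else PASS."""
    t_vector = welch_t(traces_a, traces_b)
    return "FAIL" if any(abs(t) > threshold for t in t_vector) else "PASS"


def simulate(rng, n_traces, n_points, leak_at=None, leak=0.0):
    """Hypothetical stand-in for an oscilloscope: unit Gaussian noise traces,
    optionally with a mean shift `leak` at sample index `leak_at`."""
    return [
        [rng.gauss(0.0, 1.0) + (leak if i == leak_at else 0.0)
         for i in range(n_points)]
        for _ in range(n_traces)
    ]
```

In a real setup the two trace sets would come from the measurement equipment, and the threshold C from the experiment parameters of Step 0.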


The 2016 version of ISO 17825 [12] derived the sample size 
N (of waveforms) directly from the security level: 10,000 for 
Level 3 and 100,000 for Level 4, and the statistical threshold 
value was fixed at C = ±4.5. However, this leads to a 
statistically flawed experiment; for example, the family-wise 
error rate (FWER) was not considered [13], and hence the 
resulting confidence levels were not accurate. 

Under the newer versions of ISO 17825 [7], the security 
level determines (Cohen’s) standardized effect size: d = 0.04 
for Level 3 and d = 0.01 for Level 4. This, together with 
the false-positive rate α and false-negative rate β, is used to 
determine the number of traces N and threshold C. Bonferroni 
correction is used to address FWER. 
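
To make the relationship between d, α, β, N and C concrete, the following sketch uses a textbook normal-approximation power analysis for a two-sided, two-sample test with Bonferroni correction. It is an assumption on our part that this approximates the standard's calculation; the exact formulas in ISO 17825 may differ, and the function name and the `n_points` parameter (number of sample points per trace) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def plan_experiment(d, alpha, beta, n_points):
    """Return (total trace count N, threshold C) for effect size d.

    alpha: family-wise false-positive rate over all n_points sample points.
    beta:  false-negative rate at a single leaking point.
    """
    z = NormalDist().inv_cdf
    alpha_point = alpha / n_points        # Bonferroni correction per point
    c = z(1.0 - alpha_point / 2.0)        # two-sided threshold on |T|
    # Per-set sample size for equal-sized sets A and B (normal approximation).
    n_per_set = ceil(2.0 * (c + z(1.0 - beta)) ** 2 / d ** 2)
    return 2 * n_per_set, c
```

Under this approximation N scales with 1/d², so moving from d = 0.04 to d = 0.01 multiplies the required number of traces by roughly sixteen for the same α and β.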


III. PQC EXPERIMENT CLASSES 


We will focus on NIST finalist lattice-based KEMs and Sig- 
nature algorithms. At first, the tests may seem straightforward 
to set up; the private keys and shared secrets of NIST PQC 
algorithms are considered CSPs. But this creates a lot of false 
detections. NIST PQC private keys also contain components 
that can be derived from public information; those items are 
not CSPs, and detection of their “leakage” will create false 
positives. For purposes of our discussion, we exclude such 
public, shared items of information from the secret key, and 
both pk and sk are hence required for private-key operations. 

Example: The public matrix or ring element in lattice-based 
cryptography (typically denoted by the letter A) is not assumed 
to be secret. Many algorithms derive A on-the-fly from a much 
shorter seed value using an Extendable-Output Function (XOF), 
and the seed appears in both the public and secret keys. The 
role of A is analogous to the public modulus n in RSA or a 
generator in Elliptic Curve cryptography; they are needed for private 
key operations but are also public parameters and hence need 
to be excluded from the distinguishing process. 
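
As an illustration of why A carries no secret information, the sketch below expands a short seed into a matrix of ring elements with SHAKE-128 and rejection sampling. It is a simplified toy, not the exact expansion routine of any NIST algorithm: the modulus, the 12-bit candidate width, the index encoding, and the function names are all assumptions.

```python
import hashlib

Q = 3329  # example modulus (assumption; the 12-bit mask below needs q <= 4096)


def sample_poly(xof_input: bytes, n: int, q: int = Q):
    """Rejection-sample n coefficients in [0, q) from a SHAKE-128 stream."""
    length = 4 * n
    while True:
        stream = hashlib.shake_128(xof_input).digest(length)
        # Take 12-bit candidates from consecutive byte pairs; reject values >= q.
        candidates = (
            int.from_bytes(stream[i:i + 2], "little") & 0x0FFF
            for i in range(0, length, 2)
        )
        coeffs = [v for v in candidates if v < q]
        if len(coeffs) >= n:
            return coeffs[:n]
        length *= 2  # SHAKE output is prefix-consistent, so simply read further


def expand_matrix(seed: bytes, k: int, n: int, q: int = Q):
    """Derive a k x k matrix of n-coefficient ring elements from a short seed."""
    return [
        [sample_poly(seed + bytes([i, j]), n, q) for j in range(k)]
        for i in range(k)
    ]
```

Anyone holding the public seed can recompute the same matrix, so a distinguisher that reacts to A (or to its seed) says nothing about CSP leakage.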

There are additional, fundamental algorithmic differences 
between Post-Quantum Cryptography and older asymmetric 
algorithms such as RSA. For example, [7] states that “Asym- 
metric cryptography repeatedly uses elementary operations.” 
This is less true for lattice-based algorithms than it is for 
Elliptic Curve or RSA cryptography. The structure of the 
newer algorithms is less homogeneous and consists of many 
special processing steps in addition to repeated arithmetic 
iterations. Devastating leakage can occur at almost any stage 
of the algorithm, and hence full-algorithm testing is essential. 


API Conventions: We define simple API abstractions for 
PQC algorithms at the “I/O Boundary” as required by the 
ISO 17825 standard. In practice, these APIs may need to be 
implemented as multiple discrete steps that set up or import 
keys securely for the operation. Export may also be in scope. 

We use (pk,sk) to denote public and private components 
of the keypair, preferably refactored in a way that sk contains 
only secret (CSP) parameters whose leakage leads to a security 
compromise. If masking is used, it is assumed that a fixed 
sk is remasked (“refreshed”) for each private key operation 
trace. Some APIs allow randomizers to be passed with the 
call; [seed] represents this. Such determinism is helpful for 
testing, but the seeds themselves are CSPs and must be handled 
appropriately in a side-channel context. 


A. Classes CPA and CCA: Key Establishment Mechanisms 


PQC Key Encapsulation Mechanisms (KEM) can be used 
in many different roles: for ephemeral key establishment (in 
interactive protocols such as TLS), with long-term keys 
for public-key encryption (in applications such as e-mail), and 
also for challenge-response authentication. All of these use 
cases have different security requirements. 

For technical definitions of CPA and CCA security, see [14]. 
CCA KEMs are often built on top of CPA components using 
the Fujisaki-Okamoto (FO) transform [15], [16]. The general 
test procedure is not adaptive and hence excludes interactive 
forms of tests (“experiments”) required for CPA and CCA 
security claims. The “CPA” and “CCA” terms are used here 
just to refer to the underlying components of KEMs. Security 
testing of CPA components is necessary to obtain assurances 
about the security of the CCA construct as well. 

Examples: Non-specific t-tests are commonly used in the 
literature to demonstrate basic security against side-channel 
attacks of PQC KEMs. Welch’s t-test method was used in 
[19] and [20] to demonstrate the security of masked Kyber 
implementations. However, these tests directly invoked inter- 
nal subcomponents — such interfaces may not be available for 
ISO 17825 testing. In [17] the t-test was also used in the non- 
specific “fix vs. random” A/B test mode to verify the security 
of the decapsulation component of SABER KEM against 
simple DPA (however, a neural network profiling/template 
attack was used later to attack the implementation [18]). 


CPA.API — Encrypt/Decrypt API Abstraction. 
In CPA testing, we use the CPA Encrypt/Decrypt subcom- 
ponent of KEM algorithms built using the FO transform. 


Initialize: (pk, sk) ← CPA.KeyGen([seed]) 
Public: ct ← CPA.Encrypt(pk, m, [seed]) 
Private: m ← CPA.Decrypt(ct, pk, sk) 


CPA.PD class: Public key distinguisher. 

The IUT is CPA.Decrypt(). The traces for set A use a fixed 
(but refreshed) keypair. Random, unique keypairs are used for 
each trace in set B. The ciphertexts ct are created (typically 
offline) using random messages m and matching keypairs to 
those used by IUT to decrypt. 
CPA.PD.A: m = Random, (pk, sk) = Fixed 
CPA.PD.B: m = Random, (pk’, sk’) = Varying 


Discussion: CPA.PD does not appear to be very important 
in practice as the main implication of a FAIL is that the 
attacker can identify the fixed public key used, which is usually 
inconsequential. However, a PASS on CPA.PD implies that the 
secret key is also protected; this is a very positive feature but 
may require unnecessary performance compromises. Imple- 
mentations are not expected to protect public parameters. 


CPA.SD class: Secret key distinguisher. 

The IUT is CPA.Decrypt(). All ciphertexts decrypted in set 
A are generated with a fixed (but refreshed) keypair (pk, sk). 
For set B a modified secret key sk’ is used. Random messages 
m are encrypted in both cases with pk to create the ciphertexts. 


Tweak: sk’ ← CPA.KeyMod(pk, sk) 


CPA.SD.A: m = Random, (pk, sk) = Fixed 
CPA.SD.B: m = Random, pk = Fixed, sk’ = Varying 


Discussion: The CPA.SD test is applicable only if the raw 
private key operation CPA.Decrypt() does not return a FAIL 
but will generally yield an invalid m’ in case of an invalid 
secret key. This is an assumption that can be made with the 
underlying CPA decryption primitive of most Lattice-based 
KEMs. The test also assumes that the private and public 
key components have been refactored in a way that allows 
one to create a meaningfully modified secret key sk’ with 
differences limited to CSP components. In classical cases, 
one could simply flip a bit in the CSP / secret key but for 
PQC the required synthesis is more complex. The resulting 
sk’ is likely to be inconsistent with the public key, and hence 
a special test entry method may be needed. The modification 
must be meaningful also in the sense that it affects the result; 
CPA.Decrypt(ct, pk, sk) ≠ CPA.Decrypt(ct, pk, sk’). 

We recommend that the (ct, pk, sk, sk’) test vector datasets 
for CPA.SD, or code for generating them, be published 
and shared to guarantee the consistency of the tests. 
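
The CPA.SD construction can be illustrated with a toy LWE-style encryption scheme. Everything in this sketch is an assumption made for illustration: the parameters are far too small for security, the scheme is not any standardized algorithm, and `key_mod` is only one hypothetical way to realize CPA.KeyMod. It does show the two requirements discussed above: sk’ differs from sk only in secret coefficients, and the tweak must change the decryption result.

```python
import random

N, Q = 16, 3329  # toy dimension and modulus (assumptions, not a real parameter set)


def small(rng, n=N):
    """Small vector with coefficients in {-1, 0, 1}."""
    return [rng.randint(-1, 1) for _ in range(n)]


def keygen(rng):
    a = [[rng.randrange(Q) for _ in range(N)] for _ in range(N)]
    s, e = small(rng), small(rng)
    b = [(sum(a[i][j] * s[j] for j in range(N)) + e[i]) % Q for i in range(N)]
    return (a, b), s  # pk = (A, b); sk holds only the secret vector s


def encrypt(rng, pk, bits):
    a, b = pk
    ct = []
    for bit in bits:  # one toy ciphertext (u, v) per message bit
        r, e1 = small(rng), small(rng)
        u = [(sum(a[i][j] * r[i] for i in range(N)) + e1[j]) % Q for j in range(N)]
        v = (sum(b[i] * r[i] for i in range(N)) + rng.randint(-1, 1)
             + bit * (Q // 2)) % Q
        ct.append((u, v))
    return ct


def decrypt(ct, pk, sk):
    bits = []
    for u, v in ct:  # pk is unused here; kept to mirror CPA.Decrypt(ct, pk, sk)
        w = (v - sum(sk[j] * u[j] for j in range(N))) % Q
        bits.append(1 if Q // 4 < w < 3 * Q // 4 else 0)
    return bits


def key_mod(rng, pk, sk):
    """Hypothetical CPA.KeyMod: change one secret coefficient, leave pk alone."""
    sk_mod = list(sk)
    i = rng.randrange(N)
    sk_mod[i] = rng.choice([c for c in (-1, 0, 1) if c != sk[i]])
    return sk_mod


def sd_vectors(rng, n_traces, msg_bits=32):
    """Build (ct, pk, key) vectors for sets A (fixed sk) and B (varying sk')."""
    pk, sk = keygen(rng)
    set_a, set_b = [], []
    for _ in range(n_traces):
        m = [rng.getrandbits(1) for _ in range(msg_bits)]
        set_a.append((encrypt(rng, pk, m), pk, sk))
        m = [rng.getrandbits(1) for _ in range(msg_bits)]
        ct = encrypt(rng, pk, m)
        sk_mod = key_mod(rng, pk, sk)
        while decrypt(ct, pk, sk_mod) == m:  # insist on a result-changing tweak
            sk_mod = key_mod(rng, pk, sk)
        set_b.append((ct, pk, sk_mod))
    return pk, sk, set_a, set_b
```

Seeding the generator (for example `random.Random(7)`) makes the vector sets reproducible, which is the property that published test vector datasets would provide across laboratories.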


CCA.API — Encapsulate/Decapsulate API Abstraction. 

PQC algorithms generally provide the basic Key Encapsula- 
tion Mechanism (KEM) API for their CCA security versions. 
There is no message m to select, but a random shared secret ss 
is created in encapsulation. The algorithms are often designed 
for implicit rejection; decapsulation may fail, but a synthetic 
ss value is generated instead of a return code [16]. 


Initialize: (pk, sk) ← CCA.KeyGen([seed]) 
Public: (ct, ss) ← CCA.Encaps(pk, [seed]) 
Private: ss ← CCA.Decaps(ct, pk, sk) 


CCA.PD and CCA.SD: One can define a CCA.PD public key 
distinguisher in an analogous manner to CPA.PD; however, 
failure of this test can be a result of leakage of public vari- 
ables, not CSPs. CCA.SD can be constructed using synthetic 
keypairs in a similar fashion to CPA.SD. We note that FO 
re-encryption will generally fail due to mismatch between pk 
and synthetic sk’, so CCA.SD has limited use. 


CCA.PC class: Plaintext Checking Distinguisher. 

The IUT is CCA.Decaps() and the decapsulation operation 
uses a fixed keypair (pk, sk) for all tests. The difference is in 
ciphertexts; Set A consists of valid ciphertexts, while set B 
ciphertexts are encapsulated with a mismatching public key 
pk’, pk’ ≠ pk. 

CCA.PC.A: ct = CCA.Encaps(pk), matching sk 
CCA.PC.B: ct = CCA.Encaps(pk’), mismatch 


Discussion: This CCA test invokes the Plaintext Checking 
(PC) oracle of the Fujisaki-Okamoto transform present in many 
CCA KEMs. The existence of a PC oracle generally implies 
that the scheme’s security is reduced from CCA to CPA level. 
It may also lead to compromise of CSPs [21]. 
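
The place where the two CCA.PC sets diverge can be seen in a simplified sketch of FO-style decapsulation with implicit rejection. This is a generic shape under stated assumptions, not the construction of any specific KEM: the hash choice, domain-separation tags, and the rejection secret `z` are placeholders, and the CPA primitives are passed in as hypothetical callables `cpa_encrypt(pk, m, coins)` and `cpa_decrypt(ct, pk, sk)`.

```python
import hashlib
import hmac


def _h(tag: bytes, *parts: bytes) -> bytes:
    """Placeholder hash with a simple domain-separation tag."""
    return hashlib.sha3_256(tag + b"".join(parts)).digest()


def fo_encaps(pk, m, cpa_encrypt):
    """Encapsulate: derive encryption coins from m, then the shared secret."""
    coins = _h(b"coins", m, pk)
    ct = cpa_encrypt(pk, m, coins)
    return ct, _h(b"ss", m, ct)


def fo_decaps(ct, pk, sk, z, cpa_decrypt, cpa_encrypt):
    """Decapsulate with re-encryption check and implicit rejection.

    Returns (ss, accepted). The flag is exposed only for illustration;
    a real module would never return it.
    """
    m = cpa_decrypt(ct, pk, sk)
    ct_check = cpa_encrypt(pk, m, _h(b"coins", m, pk))  # FO re-encryption
    accepted = hmac.compare_digest(ct, ct_check)        # the PC comparison
    ss_good = _h(b"ss", m, ct)
    ss_reject = _h(b"reject", z, ct)
    # Note: this Python-level selection is not constant-time.
    return (ss_good if accepted else ss_reject), accepted
```

Set A ciphertexts take the accepted path and set B ciphertexts take the rejection path. A side-channel difference between the two paths is what the CCA.PC distinguisher is designed to detect.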


B. Class Sig: PQC Signature testing 


The tests focus on the signature function (private key 
operations) as it is assumed that signature algorithms are used 
with long-term keys. Key generation is rare. 

Examples: The authors of [22] use the TVLA / t-test to find 
leakage points in an implementation of the Dilithium signature 
algorithm. They then describe a masked variant of Dilithium 
and use the t-test to verify the security of components (“gad- 
gets”) against leakage. However, in ISO 17825 methodology, 
the tester may not be able to access subcomponents. 


Sig.API — Signature API abstractions. 
Initialize: (pk, sk) ← Sig.KeyGen([seed]) 


API with a signed message envelope sm of message m: 


Private: sm ← Sig.Sign(m, pk, sk, [seed]) 
Public: {m, Fail} ← Sig.Open(sm, pk) 


Alternative API with a detached signature s of message m: 


Private: s ← Sig.Signature(m, pk, sk, [seed]) 
Public: {Ok, Fail} ← Sig.Verify(s, m, pk) 


Sig.MD class: Message distinguisher. 
The IUT is either Sig.Sign() or Sig.Signature(). Mes- 
sages m and m’ have equivalent length, e.g., 256 bits. 
Sig.MD.A: m’ = Random, (pk, sk) = Fixed 
Sig.MD.B: m = Fixed, (pk, sk) = Fixed 
Discussion: The message being signed may or may not 
be a CSP, depending on the application. For side-channel 
security we do not recommend deterministic signing that 
always produces the same signature, given the same m and 
keys. Dilithium can be operated in randomized mode too [24]. 


Sig.PD class: Public key distinguisher. The IUT is either 
Sig.Sign() or Sig.Signature(). Fixed and varying keypairs 
can be generated offline for the test. 

Sig.PD.A: m = Fixed, (pk, sk) = Fixed 
Sig.PD.B: m = Fixed, (pk’, sk’) = Varying 

Discussion: Like with its CPA.PD counterpart, failure of 
the Sig.PD test does not necessarily imply leakage of secret CSP 
information or other information leading to a signature forgery. 
Sig.SD class: Secret key distinguisher. 

The IUT is either Sig.Sign() or Sig.Signature(). For this 
test, one needs to be able to modify the secret components sk 
of the keypair while causing minimal changes to the public val- 
ues pk; ideally, one constructs synthetic keypairs (pk, sk) and 
(pk, sk’) with differences only in the actual secret component. 
The exact procedure for such test vector generation depends 
on the algorithm under test, and should be standardized. 


Tweak: sk’ ← Sig.KeyMod(pk, sk) 


Sig.SD.A: m = Fixed, pk = Fixed, sk’ = Varying 
Sig.SD.B: m = Fixed, pk = Fixed, sk = Fixed 
Discussion: As with CPA.SD and CCA.SD, the synthetic 
secret keys sk’ do not need to be completely random but can 
have only small differences to sk; the aim is to keep changes 
to the pk public-key parameters required for verification at a 
minimum. For example, a Dilithium [24] secret key contains 
(ρ, K, tr) parameters that can be derived from the public key, 
in addition to the secret vectors (s1, s2). Only the latter usually 
need to be protected against leakage. A practically relevant 
testing procedure is able to perform private key operations with 
the same ρ public values but varying (s1’, s2’) parameters, and 
compare those traces against fixed (s1, s2). 
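
A refactored key representation makes this kind of tweak straightforward. The sketch below is a hypothetical data layout, not the Dilithium key encoding: the dimensions and the coefficient bound are assumptions loosely modelled on the smallest parameter set, and `sig_key_mod` simply resamples the secret vectors while leaving (ρ, K, tr) untouched. It makes no attempt to keep sk’ consistent with the public key, so signatures produced with sk’ are not expected to verify.

```python
import random
from dataclasses import dataclass, replace

# Assumed toy dimensions, loosely modelled on the smallest Dilithium parameter set.
N_COEFFS, L_DIM, K_DIM, ETA = 256, 4, 4, 2


@dataclass(frozen=True)
class SigSecretKey:
    rho: bytes   # public seed for the matrix A (also part of pk)
    key: bytes   # K
    tr: bytes    # hash of the public key
    s1: tuple    # secret vector: L_DIM polynomials
    s2: tuple    # secret vector: K_DIM polynomials


def sample_small_vector(rng, n_polys, eta=ETA):
    """n_polys polynomials with coefficients drawn uniformly from [-eta, eta]."""
    return tuple(
        tuple(rng.randint(-eta, eta) for _ in range(N_COEFFS))
        for _ in range(n_polys)
    )


def sig_key_mod(rng, sk):
    """Hypothetical Sig.KeyMod: fresh (s1', s2'), unchanged (rho, K, tr)."""
    return replace(
        sk,
        s1=sample_small_vector(rng, L_DIM),
        s2=sample_small_vector(rng, K_DIM),
    )
```

A variant with smaller differences, as the discussion permits, would change only a few coefficients of s1 or s2 instead of resampling both vectors.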


IV. CONCLUSIONS 


We predict that the FIPS 140-3 non-invasive security base- 
line ISO 17825 will apply very broadly to commercial hard- 
ware cryptography. It can be used to demonstrate the existence 
and coverage of side-channel countermeasures — but these 
PASS/FAIL leakage detection tests do not really measure the 
cost of an attack by an adversary. 

We have presented classes of simple ISO 17825 / TVLA 
“push button” test procedures for PQC KEM and Signature 
algorithms. Perhaps unsurprisingly, we find that side-channel 
leakage testing of PQC modules can’t be entirely “black box”, 
but requires carefully designed test vectors. 

The straightforward application of “random vs. fixed” keypair 
distinguishers (CPA.PD, CCA.PD, Sig.PD) is not very 
meaningful, as leakage of the public key does not imply leakage of 
CSP information. Due to the complex linkage between private 
and public parts of the keypair in PQC, one needs a special 
method of constructing synthetic keypairs that modify only the 
secret CSP components without modifying the public ones. 

For PQC key establishment (KEMs) implemented with 
the Fujisaki-Okamoto paradigm, verification of direct leakage 
of CSPs can be obtained via the CPA Decrypt component 
(CPA.SD). One needs to also test for the presence of the plain- 
text checking oracle (CCA.PC) for partial CCA assurance. 

For digital signatures, one generally wants to implement 
randomized signatures instead of deterministic signatures; this 
can be verified with a message distinguisher (Sig.MD). As 
with KEMs, verification of direct secret leakage requires an 
understanding of the scheme to generate differentiated keys 
sk’ that can be used for CSP leakage detection (Sig.SD). 

Standardized PQC test vector generation procedures are 
needed to obtain consistent testing results in all cases. 